Characteristic Substructures and Properties in Chemical Carcinogens Studied by the Cascade Model
Abstract
MOTIVATION: Chemical carcinogenicity is an important subject in health and environmental sciences, and a reliable method is needed to identify the factors characteristic of carcinogenicity. The Predictive Toxicology Challenge (PTC) 2000-2001 gave various data mining methods the opportunity to evaluate their performance. The cascade model, a data mining method developed by the author, can mine for local correlations in data sets with a large number of attributes. This paper explores the effectiveness of the method on the problem of chemical carcinogenicity.
RESULTS: Rodent carcinogenicity of 417 compounds examined by the National Toxicology Program (NTP) was used as the training set. For example, the analysis by the cascade model yielded the rule 'Highly flexible molecules are carcinogenic, if they have no hydrogen bond acceptors in halogenated alkanes and alkenes'. The resulting rules were applied to predict the activity of 185 compounds examined by the FDA. The ROC analysis performed by the PTC organizers showed that the method has excellent predictive power for the female rat data.
AVAILABILITY: The binary program of DISCAS 2.1 and samples of input data sets for Windows PC are available at http://www.clab.kwansei.ac.jp/mining/discas/discas.html upon request from the author.
SUPPLEMENTARY INFORMATION: A summary of the prediction results and cross validations is accessible via http://www.clab.kwansei.ac.jp/~okada/BIJ/BIJsupple.htm. The rules used and the prediction results for each molecule are also provided.
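The ROC analysis mentioned in the results can be illustrated with a minimal sketch. It is not the PTC organizers' procedure or the DISCAS implementation; the function names `roc_points` and `auc` and the toy scores are hypothetical, and the sketch assumes each compound receives a numeric score where higher means "more likely carcinogenic".

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points obtained by
    sweeping a decision threshold from the highest score to the lowest.
    labels: 1 = active (carcinogenic), 0 = inactive."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Compounds with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Toy data, invented for illustration only (not PTC results).
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    toy_labels = [1, 1, 0, 1, 0, 0]
    print(auc(roc_points(toy_scores, toy_labels)))
```

An area close to 1 means that active compounds are ranked above inactive ones almost everywhere, while 0.5 corresponds to a random ranking.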
Journal: Bioinformatics
Volume 19, Issue 10
Publication date: 2003